Programming Massively Parallel Processors: A Hands-on Approach: The Heterogeneous Computing Landscape: Why OpenCL?

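The era of homogeneous computing, in which a single CPU handles every task, has reached its physical limits. Today we work in a heterogeneous computing landscape, where performance is driven by a range of specialized hardware working together: GPUs for high-throughput computation, FPGAs for logic operations, and DSPs for signal processing.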

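1. The Shift to Heterogeneity

Gains in modern computing performance no longer come from raising raw clock frequency but from integrating specialized accelerators. A heterogeneous system uses a host (typically a multicore CPU) to orchestrate tasks across multiple compute devices, each with its own memory and execution characteristics.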

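2. The OpenCL Device Model

OpenCL (Open Computing Language) provides a unified framework for managing this diversity. It treats every piece of hardware as a device that is divided into compute units (CUs). Through the platform layer, developers can query device-specific capabilities at runtime, such as clock frequency and memory size, so that the same code can adapt to hardware from different vendors.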

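3. Portability and Efficiency

Although OpenCL offers code portability (one kernel written for all vendors), its real strength is portable efficiency: it gives developers fine-grained control to tune performance for the underlying architecture of each platform.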
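One common form of such per-device tuning is choosing the work-group size from values queried at runtime instead of hard-coding it. The sketch below is plain C and uses no OpenCL calls. It assumes the caller has already obtained the device limit (CL_DEVICE_MAX_WORK_GROUP_SIZE) and the kernel's preferred multiple (CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE). The helper names pick_local_size and round_up_global are hypothetical, and the heuristic is only a starting point: the largest permitted work-group is not always the fastest.

```c
#include <stddef.h>

/* Hypothetical heuristic: the largest multiple of `preferred_multiple`
   that does not exceed `max_wg_size`. Falls back to `max_wg_size` when
   the multiple is zero or does not fit, and to 1 when the limit is zero. */
size_t pick_local_size(size_t max_wg_size, size_t preferred_multiple)
{
    if (max_wg_size == 0)
        return 1;
    if (preferred_multiple == 0 || preferred_multiple > max_wg_size)
        return max_wg_size;
    return (max_wg_size / preferred_multiple) * preferred_multiple;
}

/* Rounds the problem size `n` up to a multiple of `local`, so the global
   size divides evenly into work-groups. The kernel must then ignore
   work-items whose index is >= n. `local` must be nonzero. */
size_t round_up_global(size_t n, size_t local)
{
    return ((n + local - 1) / local) * local;
}
```

For a device limit of 100 and a preferred multiple of 32, pick_local_size returns 96. A problem of 1000 elements with a local size of 256 is padded to a global size of 1024 by round_up_global.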

QUESTION 1

Read the “OpenCL Platform Layer” section of the OpenCL specification. Compare the platform querying API functions with what you have learned in CUDA.

CUDA and OpenCL both use a single function to find devices without vendor platforms.

OpenCL requires a hierarchical query (Platform then Device), while CUDA queries devices directly.

OpenCL cannot query device capabilities at runtime, whereas CUDA can.

OpenCL platforms are equivalent to CUDA streaming multiprocessors.

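Answer: OpenCL requires a hierarchical query (platform, then device), while CUDA queries devices directly.

Hardware discovery in CUDA is simpler (cudaGetDeviceCount) because it targets a single vendor. OpenCL first calls clGetPlatformIDs (to find vendor platforms such as NVIDIA or Intel) and then clGetDeviceIDs on each platform to handle the heterogeneous landscape.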

QUESTION 2

What is the primary role of the 'Host' in a heterogeneous system?

To perform all high-throughput mathematical calculations.

To act as the conductor, orchestrating tasks across specialized devices.

To replace the GPU for graphics rendering.

To provide power only to the FPGA.

QUESTION 3

How does OpenCL abstract hardware units like a Streaming Multiprocessor (SM)?

As a Processing Element (PE).

As a Compute Unit (CU).

As a Memory Bank.

As a Platform Identifier.

QUESTION 4

Why is 'Portable Efficiency' valued over simple 'Code Portability' in OpenCL?

Because code that runs on everything automatically runs at peak speed.

Because it allows developers to tune code for specific architectural nuances while keeping the source portable.

Because it removes the need for kernel optimization.

Because OpenCL only supports CPUs.

QUESTION 5

Which OpenCL constant is used to query for any hardware device type (CPU, GPU, etc.)?

CL_DEVICE_TYPE_GPU

CL_DEVICE_TYPE_ALL

CL_DEVICE_VENDOR_ONLY

CL_PLATFORM_ALL

Answer: CL_DEVICE_TYPE_ALL

CL_DEVICE_TYPE_ALL allows the host to discover all supported compute devices, regardless of type, on each platform of the heterogeneous system.